The endoplasmic reticulum stress-related genes and molecular typing predicts prognosis and reveals characterization of tumor immune microenvironment in lung squamous cell carcinoma

Background Endoplasmic reticulum stress (ERS) acts critical roles on cell growth, proliferation, and metastasis in various cancers. However, the relationship between ERs and lung squamous cell carcinoma (LUSC) prognoses still remains unclear. Methods The consensus clustering analysis of ERS-related genes and the differential expression analysis between clusters were investigated in LUSC based on TCGA database. Furthermore, ERS-related prognostic risk models were constructed by LASSO regression and Cox regression analyses. Then, the predictive effect of the risk model was evaluated by Kaplan–Meier, Cox regression, and ROC Curve analyses, as well as validated in the GEO cohort. According to the optimal threshold, patients with LUSC were divided into high- and low- risk groups, and somatic mutations, immune cell infiltration, chemotherapy response and immunotherapy effect were systematically analyzed. Results Two ERS-related clusters were identified in patients with LUSC that had distinct patterns of immune cell infiltration. A 5-genes ERS-related prognostic risk model and nomogram were constructed and validated. Kaplan–Meier curves and Cox regression analysis showed that ERS risk score was an independent prognostic factor (p < 0.001, HR = 1.317, 95% CI = 1.159–1.496). Patients with low-risk scores presented significantly lower TIDE scores and significantly lower IC50 values for common chemotherapy drugs such as cisplatin and gemcitabine. Conclusion ERS-related risk signature has certain prognostic value and may be a potential therapeutic target and prognostic biomarker for LUSC patients. Supplementary Information The online version contains supplementary material available at 10.1007/s12672-024-00887-4.


Introduction
Lung cancer is the common and most lethal malignancy worldwide [1].According to the 2020 global cancer statistics, lung cancer ranks first in mortality and second in morbidity, respectively [2].Except lung adenocarcinoma, lung squamous cell carcinoma (LUSC) is the most common histologic subtype of lung cancer, accounting for approximately 30% of all cases [3].Despite great progress in the diagnosis and treatment of LUSC, the 5-year survival rate of patients with LUSC remains low [4].Therefore, discovering promising diagnostic and prognostic biomarkers or drug targets is an urgent medical need.
The endoplasmic reticulum (ER) is the largest organelle in eukaryotic cells and is the main site of protein synthesis, processing and transportation [5].ERS is a status imbalance of ER homeostasis caused by the accumulation of unfolded or misfolded proteins and changes of Ca + concentration [6].As an important organelle, dysfunction of the ER has important implications for several cellular biological processes, such as lipid synthesis, protein folding, calcium storage, and signal transduction [7].Furthermore, previous studies demonstrated that ERS was involved in the occurrence and malignant progression of various cancers such as lung cancer [8][9][10].The continuous activation of ERS sensors enhances the tumorigenicity and metastasis potential of malignant cells, allowing them to acquire resistance through their mediated dormancy and immunosuppression, and ultimately survive chemotherapy [8,11,12].However, ERS acts as an oncogenic factor only at moderate levels, while uncontrolled stress may also lead to cell death [13].Intracellular reactive oxygen species (ROS) mediated ERS suppresses tumors in lung cancer cells [10].Furthermore, the ERS pathway can be augmented by neutrophil arginase-1, which is discharged from activated apoptotic human neutrophils, and triggers apoptosis in cancer cells [14].Therefore, it was reasonable to believe that ERS may be a valuable target for the treatment of malignant tumors.However, the detailed mechanism and clinical prognostic relationship of ERS in the occurrence and progression of LUSC are still unclear.Identification of certain potential biomarkers related to the occurrence and progression of LUSC would eventually contribute to better understanding of tumor development and further identification of candidate therapeutic targets.
In light of these backgrounds, we explored the patterns of ERS-related genes and the prognostic value of these genes in clinical practice based on TCGA dataset.We developed an independent prognostic ERS-related risk signature and validated it in the GEO cohort.Furthermore, this signature was used to predict tumor immune cell infiltration, chemotherapy response, and immunotherapy effect and discover potential small molecule drugs.This study might provide scientific foundation for exploring new prognostic biomarkers and treatments of LUSC.

Data collection and processing
The Cancer Genome Atlas (TCGA) database (https:// portal.gdc.cancer.gov/) was utilized to obtain RNA-sequencing data and clinical information.Additionally, the Gene Expression Omnibus (GEO) database (https:// www.ncbi.nlm.nih.gov/ geo/) was accessed to procure the GSE37745 dataset.The TCGA dataset comprised 49 normal lung tissues and 496 LUSC samples, from which gene expression profiles were extracted.The inclusion criteria were as follows: (1) samples were collected from patients with LUSC, and (2) corresponding clinical follow-up data were available.The exclusion criteria for the samples were as follows: (1) patients with no follow-up time, no survival status, and other clinical follow-up information missing; (2) patient samples with clinical data but no corresponding RNA-Seq data; (3) samples followed up for less than 30 days.After screening, this study included 469 LUSC samples along with corresponding clinical information (Additional file 2: Fig. S2).GSE37745 contained information of 66 LUSC samples.

Consensus clustering analysis of ERS-related gene
Two ERS-related gene sets (GO RESPONSE TO ENDOPLASMIC RETICULUM STRESS and GO REGULATION OF RESPONSE TO ENDOPLASMIC RETICULUM STRESS) were downloaded from the Molecular Signature Database (MSigDB) v7.5 [15] (http:// www.gsea-msigdb.org/ gsea/ msigdb/).A total of 260 ERS-related genes were obtained by merging and deduplication.Among them, 240 genes could be found in the TCGA dataset.Utilizing the "ConsensusClusterPlus" R package, we divided the TCGA dataset into two distinct clusters and analyzed the consensus cumulative distribution function (CDF) curves to determine the optimal number of clusters.

Analysis of differential gene expression
The TCGA gene sequencing data were analyzed by using "Deseq2" [16] package, set thresholds as |log2FC|> 1.0 and Padj < 0.05 to identify differential genes in C1 and C2.Heatmaps were plotted by using the "pheatmap" package based on the results of differential gene expression analysis.

Immunogenomic landscape analysis
The CIBERSORT algorithm was employed to assess variations in the infiltration levels of 22 immune cell types between two groups, namely C1 and C2, corresponding to high-and low-risk categories.The differential expression of eight

Construction and validation of prognostic risk model related to ERS
Univariate cox analysis was used to assess the relationship between differentially expressed genes (DEGs) of C1 and C2 clusters and overall survival (OS) of LUSC patients.The TCGA dataset was randomly divided in a 1:1 ratio into two groups, namely the train set and the test set.Using Lasso regression and multivariate cox analysis, prognostic risk models associated with ERS were developed in the train group.The risk score for each sample was calculated as following formula: Risk score = Σ(Expi × Coef ).In this formula, xi was the relative expression value of each selected gene and βi was the coefficient obtained from LASSO analysis.For validation, the same genes, regression coefficients and formula were applied in GEO dataset and test group to calculate the risk score.Subsequently, patients with LUSC were categorized into high-and low-risk groups based on the optimal threshold of risk score, and the OS was compared through K-M analysis.To validate the sensitivity and specificity of the prognostic model, ROC curves were computed.The distribution pattern of each risk group was evaluated through Principal Component Analysis (PCA) using the "stats" package.Nomogram were constructed by integrating clinical parameters and risk scores, and the predictive performance of nomogram, risk scores, and other clinical parameters was assessed by analyzing the ROC and the decision curve analysis (DCA) curves.

Functional analysis
We conducted gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses, which included three key aspects that describe biological function: molecular function (MF), cellular component (CC), and biological process (BP).Padj < 0.05 for GO pathway and KEGG pathway were considered to be statistically significant.The relevant pathways and molecular mechanisms were assessed by downloading c2.cp.kegg.v7.5.1.symbols.gmtfrom the GSEA online web site and visualize the first five pathways, and Padj < 0.05 were considered to be statistically significant.

Somatic mutation analysis
The "maf" data of somatic mutations in LUSC samples were obtained from the TCGA GDC database.Mutated genes were summarized and visualized using the "Maftools" R package.

Statistical analysis
Statistical analysis was conducted using R Studio software (version 4.1.1).Survival distribution was estimated using K-M survival curve analysis.Prognostic value of stress-related characteristics of ERS was evaluated through Cox regression analysis, nomogram model, and time ROC curve analysis.All p-value < 0.05 were considered to be statistically significant.

Identification of immune cell infiltration for the two subtypes
The CIBERSORT algorithm was used to study the infiltration level of 22 immune cell types.The results showed that the infiltration level of native B cells, activated mast cells and eosinophils was higher in the C2 subtype than the corresponding infiltration levels in the C1 subtype (Fig. 2A, B).In contrast, it showed lower infiltration levels of resting memory CD4 T cells, activated memory CD4 T cells, regulatory T cells (Tregs), macrophages M1, mast cells resting, and neutrophils in C2 cluster than in C1 cluster.The expression levels of 8 inhibitory immune checkpoints were examined to better analyze the tumor immune microenvironment (TIME) of both subtypes.Except for the expression level of CD274, which did not differ, the expression of the other seven suppressive immune checkpoints were found to be significantly up-regulated in C1 than that in C2 (Fig. 2C).

Functional enrichment analysis of two clusters DEGs
GO and KEGG analyses were used to investigate the potential biological functions and pathways of DEGs.These DEGs were enriched in immune-related biological processes and molecular functions, such as leukocyte mediated immunity, positive regulation of leukocyte activation, activation of immune responses, immunoglobulin complex, and immune receptor activity (Fig. 2D).KEGG analyses results also showed enrichment of immune-related pathways, include cytokinecytokine receptor interaction, cell adhesion molecules, phagosome, and chemokine signaling pathway (Fig. 2E).

Construction and validation of risk models
To further investigate the prognostic features between distinct clusters, we performed Lasso-Cox regression analysis on 1243 DEGs between the two subtypes and constructed a prognostic risk model related to ERS (Fig. 3A-C).The expression of five genes used to construct the prognostic risk model was shown in the heatmap (Additional file 1: Fig. S1A).The risk score was calculated by the following formula: 0.1628*ExpressionFLNC − 0.1216*ExpressionGLYATL2 + 0.1432*Expres-sionKLK6-0.1462*ExpressionPLAAT1 + 0.1427*ExpressionPTGIS.LUSC patients were divided into two risk groups according to the optimal threshold of risk score (Fig. 3D).Results from PCA and t-SNE indicated that LUSC samples belonging to different risk groups were distributed in distinct regions (Additional file 1: Fig. S1B, C).Correlation analysis of risk scores and survival status showed that higher risk score was associated with higher mortality (Fig. 3D).Patients in the high-risk group had a worse prognosis than those in the low-risk group (p < 0.001) (Fig. 3E).The Sankey diagram showed that most C1 samples were in the high-risk group and had a poor prognosis, validating the results of the previous survival analysis (Fig. 3F, G).The areas under ROC curves for 1, 3 and 5 years were 0.639, 0.713 and 0.720 (Fig. 3H), respectively, which indicates that the model has good prediction efficiency.The application of the risk model showed significant

Prediction of chemotherapy effects by the risk signature
First, data on the response of patients in high-and low-risk to common chemotherapeutic drugs were downloaded from the GDSC database.The results showed that many common chemotherapeutic drugs for LUSC differed significantly between the two groups (Fig. 7A).Gemcitabine and Cisplatin, as the most common chemotherapy medicines for LUSC, exhibited significantly lower IC50 values in the low-risk group than in the high-risk group (p < 0.05), indicating that Gemcitabine and Cisplatin might be more effective in the low-risk group.Nevertheless, the effect of these drugs in the treatment of LUSC patients remains to be further proven in future clinical studies.The relationship between model genes and drug sensitivity was analyzed from the cellMiner database, and the scatterplot of the top 16 drugs with the highest correlation between genes and drug sensitivity was drawn (Fig. 7B).A positive correlation indicates that higher gene expression is more sensitive to drugs, while a negative correlation indicates that higher gene expression is more resistant to drugs.Among them, the up-regulation of FLNC and PTGIS expression may lead to enhanced sensitivity of patients to most drugs.These findings imply that changes in the expression of genes constructing ERS-related signature may be useful in predicting medication response and serving as prospective therapeutic targets.

Prediction of potential drugs
We analyzed DEGs between the high-and low-risk groups to screen for novel therapeutic compounds for the treatment of LUSC.Among the 99 DEGs studied, there were 64 up-regulated genes and 35 down-regulated genes.As shown in (tracazolate, AT-9283, altizide, amylocaine, westcort, carbenoxolone, dovitinib and SAL-1) and their corresponding mechanisms of action (MoA) were explored as well.Among them, AT-9283 acts through 7 kinds of MoAs.Dovitinib acts through epidermal growth factor receptor (EGFR) inhibitors, fibroblast growth factor receptor (FGFR) inhibitors, platelet-derived growth factor receptor (PDGFR) inhibitors, and vascular endothelial growth factor receptor (VEGFR) inhibitors.The 2D chemical structures of each small molecule compound were also downloaded and are shown in Fig. 7C.

Discussion
Under the conditions of protein homeostasis in normal cells, sensors of ERS, including ATF6, IRE1a and PERK are inactivated, whereas in the tumor microenvironment (TME), multiple factors such as hypoxia [18], abnormal nutrient supply [19] and ROS accumulation [20] interfere with protein folding in the ER, thereby driving ERS in cancer cells.It was found that ERS was closely associated with the development of lung cancer [10].For example, previous study showed that ERS promoted the process of epithelial-mesenchymal transition and cell invasion in LUAD [21].Shu et al. found that ERS was associated with the efficacy of targeted therapy and immunotherapy for LUAD [22].In addition, cancer cells could acquire resistance and survive chemotherapy through ERS-mediated dormancy and immunosuppression [8,11,12].Further studies found that ERS was associated with cisplatin resistance in lung cells [23].However, no studies have investigated the relationship between ERS-related genes and LUSC prognosis.Thus, we speculate that ERS has been largely overlooked as a novel prognostic marker for LUSC.In this work, we present several new findings: To our knowledge, this is the first time that an ERS-derived prognostic model has been developed for evaluating the prognosis of patients with LUSC.Second, we validated this model using the independent GSE37745 cohort.Importantly, we assessed the responses of different clusters or risk populations to commonly used medications for LUSC and predicted several potential small-molecule compounds.We incorporated a comprehensive panel of ERS genes into our analysis of LUSC, which aims to enhance our comprehension of the involvement of ERS in the TME of LUSC and potentially facilitate the advancement of novel therapeutic approaches.The ERS-related risk model showed strong predictive effects in both the independent test set and the external validation set, which indicated that our risk model was reliable and effective.The result of the correlation between the risk score and the TIME showed that the immune scores of patients in the high-risk group were significantly higher than that in the low-risk group, and the abundance of tumor immune cells infiltration was also significantly different between the two groups.Based on our findings, the collective expression pattern of ERS-related genes appears to play a critical role in facilitating cancer cell adaptation and growth, ultimately resulting in an unfavorable prognosis for patients.Thus, ERS as a reliable predictor of LUSC prognosis and response to immunotherapy might provide valuable insights into the establishment of effective LUSC therapies.
Complementary studies targeting ERS and modulating the immunosuppressive TME component will become promising solid tumor therapies that target ERS-based LUSC.We found that C1 subtype samples had poorer survival than C2, suggesting that ERS-related gene patterns might be involved in tumor progression.In this work, we assessed the percentage of various cells in the TIME using CIBERSORT and ESTIMATE revealing that the C1 cluster population presented an immunosuppressive microenvironment landscape, which could be one of the reasons for treatment resistance in these patients.Moreover, the two subtypes had different levels of immune cell infiltration, which might be related to the significantly upregulated levels of Tregs and neutrophil infiltration observed in C1.Tregs, which are potent immunosuppressive cells in the immune system, can promote cancer progression by suppressing antitumor immunity [24].They were heavily infiltrated in various tumors and often associated with poor clinical outcomes [25].Neutrophils, on the other hand, are known to be involved in all stages of carcinogenesis [26].High levels of neutrophil infiltration were related with poor survival in a range of malignancies, including lung cancer [27,28].In addition, the function of differential genes between the two clusters was enriched in immune-related and other functions.Furthermore, we analyzed the inhibitory immune checkpoint genes such as programmed cell death protein 1 (PD-1), cytotoxic T lymphocyte protein 4 (CTLA4), lymphocyte activation gene 3 protein (LAG3), and T cell immunoreceptor with immunoglobulin and ITIM domain (TIGIT), which participate in T cell dysfunction and malfunction of anti-tumor immunity [29], and discovered that they were significantly up-regulated in the C1 cluster.In the two subtypes mentioned above, CD8 + T cell presence was not statistically significant.Despite the fact that CD8 + T cell usually regarded as a positive regulation of anti-tumor immunity and signs of a good prognosis, and considering the TME as a complicated and elaborate regulation network, we speculate that the overall effect of ER stress is closely correlated with T cell exhaustion and formation of an immunosuppressive TME in LUSC.This is consistent with previous reports that PERK pathway is involved in the immunosuppressive function of MDSC [30].These results provide further evidence that ERS might indeed play a pivotal role in the regulation of TME.A high expression level of inhibitory immune checkpoint genes hampers the immune response and facilitates the immune escape process in cancers.Presently, immune checkpoint inhibitors targeting CTLA4, LAG3, PD-1, and PD-1 ligand 1 (PD-L1) have been approved for clinical treatment, and some progress has been achieved in this field [31][32][33].Together, these data are important for optimizing personalized, targeted treatment strategies for different sub-clusters.
Among these model genes (FLNC, KLK6, PTGIS, PLAAT1 and GLYATL2), FLNC is an actin-binding protein that regulates the actin cytoskeleton in cells and is involved in cancer metastasis.As for the exact effect of FLNC on the prognosis of cancers, previous studies reported inconsistent results.Over-expression of FLNC in gastric and prostate cancers inhibited the proliferation and metastasis of tumor cells, and it was associated with a good prognosis [34].However, several studies indicated that FLNC might promote tumor progression, migration and invasion [35,36].For instance, it was demonstrated that FLNC enhances the aggressiveness of glioblastoma cells by inducing MMP2 activation, leading to poor prognosis [35].High expression of FLNC promoted lymphatic invasion and metastasis of esophageal squamous cell carcinoma by regulating Rho GTPase [36].Therefore, this study system analyzed the role of FLNC expression in LUSC patients.The result suggested a significant association between high FLNC expression and poorer OS in LUSC, which indicated that overexpression of FLNC might contribute to poor prognosis.
KLK6 is a secreted serine protease, which promotes the production of TNF-α by macrophages through PAR1 in the TME to stimulate the production of CXCL1 in tumor cells, thereby promoting tumor growth and metastasis [37].In non-small cell lung cancer, high expression of KLK6 was an indicator of tumor proliferation and poor prognosis [38]; KLK6 was possible to be a potential novel diagnostic biomarker for LUSC [39].These previous studies supported our results that high expression of KLK6 was associated with poor prognosis in patients with LUSC.
PTGIS and PLAAT1 might serve as new therapeutic targets for LUSC as well as biomarkers for prognosis and tumor immunity [40].PLAAT1 (HRASLS) regulated cell proliferation, tumor suppression, and phospholipid metabolism.[41].Furthermore, studies have demonstrated its ability to promote the degradation of lens organelles such as the ER and lysosomes [42].PTGIS is a member of the cytochrome P450 family and it is a key gene for synthesis of PGI2 and thus could indirectly affect normal inflammatory responses activation and differentiation of immune cell by catalyzing PGI2 [43,44].Dai et al. found that high expression of PTGIS promoted infiltration of TAMs and Tregs in the TME and worsened the prognosis of patients with lung, ovarian and gastric cancers [45].In ovarian cancer, it was driven by PGI2/PTGIR to induce pro-tumor and immunosuppression [46].Furthermore, Lei et al. discovered in his experimental study that contrary to PLAAT1's inhibition of proliferation, migration and invasion of LUSC, PTGIS promoted that [40].This was consistent with our study and further demonstrated the reliability of the risk model we constructed.
GLYATL2 is a member of the glycine-binding enzyme gene family that localizes to the ER and belongs to the class of ER-associated proteins.The N-acyl glycine produced was anti-inflammatory and anti-proliferative, and its high level of expression in the skin and lung might indicated an important role in barrier function and immune response [47].Given this, we speculated that the observed association between elevated GLYATL2 expression and good prognosis in LUSC might be due to its biological functions.Whereas four of the five model genes were involved in the progression of multiple cancers, including LUSC, and were not only significantly associated with patient survival and prognosis, but their expression patterns were also associated with tumorigenesis and regulation of the TIME, which indicated that the results of our bioinformatics analysis were meaningful to a certain extent.Therefore, the ERS risk model might well predict the prognosis of LUSC and the immunotherapy efficacy.
Assessing the comprehensive characterization of immune-related genes and pathways helps to understand the association between cancer stemness and TME in lung cancer [48].Functional analysis showed that high-risk populations were mainly involved in immune-related functions and signaling pathways, suggesting an interaction between ERS and LUSC immune response.Chemokines were found to be involved in promoting tumor heterogeneity by maintaining or promoting a stem cell-like phenotype.CXCR4/CXCL12 interactions contributed to the promotion of tumor-initiating cells in lung cancer, which was associated with chemotherapy resistance [49].It has been shown that ERS could play an important role in tumor development through its immunomodulatory function [10].Accordingly, we found that although patients in the high-risk group had a high immune infiltration profile, they also had a large number of immunosuppressive cells, such as Treg cells and M2 subtype macrophages.This was consistent with the immunosuppressive role of ERS in cancer reported by Cubillos-Ruiz et al. [8].Compared to the low-risk group, the high expression of suppressive immune checkpoints and the enrichment of tumor immunosuppressive cells in the high-risk group suggest that the model successfully differentiates the immune type of LUSC.This suggested that ERS could regulate the immune microenvironment of LUSC and thus affect the prognosis of LUSC patients.Meanwhile, TIDE analysis showed that the TIDE score in the high-risk group was significantly higher than that in the low-risk group, indicating that ICI treatment was less effective in highrisk patients.Notably, the higher frequency of mutations in the low-risk group also suggests that they may be better able to benefit from immunotherapy [50].In addition, we further analyzed the differences in the efficacy of commonly used chemotherapy drugs in the high-and low-risk groups and found that patients in the low-risk group were more sensitive to common chemotherapy drugs such as cisplatin.These results suggested that ERS characteristics might also be a reasonable and effective method for screening patients receiving chemotherapy.Thus, these findings suggested that the risk model we constructed could be used as an effective indicator to assess the response of LUSC patients to chemotherapy and immunotherapy and might provide useful advice for the future treatment of these patients.
Nevertheless, our study existed some limitations that should be acknowledged.Firstly, the retrospective design of our investigation called for further validation of the prognostic risk model through prospective clinical studies involving larger sample sizes.Secondly, our analysis, and hence conclusions, are based on data obtained from public databases, which may accordingly have led to inherent case selection bias.In addition, although we extensively discussed the prognostic value and biological implications of ERS in LUSC, their specific molecular mechanisms of action in LUSC still need to be verified by in vivo and in vitro experiments.Finally, we predicted several latent compounds for targeting LUSC patients, whose toxicological responses should also be considered in future studies.

Conclusion
We categorized LUSC samples into two clusters based on their association with ERS and varying levels of prognosis and immune infiltration, and constructed risk models with good accuracy based on the DEGs between the two clusters.Our analysis revealed distinct immune characteristics between the high-risk and low-risk groups.Therefore, our findings might provide an important reference for clinicians in clinical diagnosis, prognostic analysis and achieving individualized comprehensive treatment of LUSC.

Fig. 1
Fig. 1 ERS-related clusters of LUSC.A Consensus cluster matrix of LUSC samples when k = 2. B CDF curves for k = 2-9.C Delta area for k = 2-9 in the consistent clustering model.D Heatmap of ERS-related genes and distribution of clinical parameters between two clusters.E PCA for two ERS-related clusters.F OS of two clusters.The IC50 values of (G) Gemcitabine, (H) Paclitaxel, (I) Afatinib, and (J) Gefitinib in two clusters.*p < 0.05

Fig. 4
Fig. 4 Construction and validation of a nomogram for predicting OS in TCGA dataset.A, B Univariate and multivariate regression analysis.C Construction of a nomogram based on age, gender, stage and risk score.D Verification of the predictive accuracy of the nomogram by calibration curves.E-G ROC curves of the nomogram, risk score, and clinical parameters for predicting the 1-, 3-and 5-year OS.H-J DCA curves for comparing the net survival benefit of the nomogram, risk score, and clinical parameters ▸

Table 1 ,
the top 8 potential novel therapeutic small molecule compounds with the highest negative scores were screened